Conceptual document indexing using a large scale semantic dictionary providing a concept hierarchy

Authors

  • Martin Rajman
  • Pierre Andrews
  • Florian Seydoux
Abstract

Automatic indexing is one of the important technologies used for Textual Data Analysis applications. Standard document indexing techniques usually identify the most relevant keywords in the documents. This paper presents an alternative approach that aims at performing document indexing by associating concepts with the document to index instead of extracting keywords out of it. The concepts are extracted out of the EDR Electronic Dictionary that provides a concept hierarchy based on hyponym/hypernym relations. An experimental evaluation based on a probabilistic model was performed on a sample of the INSPEC bibliographic database and we present the promising results that were obtained during the evaluation experiments.
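The idea of indexing a document with concepts from a hyponym/hypernym hierarchy, rather than with its own keywords, can be illustrated with a small Python sketch. This is only a toy illustration and not the probabilistic model evaluated in the paper: the hierarchy, the word-to-concept mapping, the function names, and the count-based selection heuristic are all hypothetical, and none of the data comes from the EDR Electronic Dictionary.

```python
from collections import Counter

# Hypothetical toy hierarchy: concept -> its hypernym (parent). Not EDR data.
HYPERNYM = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
    "animal": None,
}

# Hypothetical mapping from surface words to leaf concepts.
WORD_TO_CONCEPT = {
    "dog": "dog",
    "puppy": "dog",
    "cat": "cat",
    "kitten": "cat",
    "sparrow": "sparrow",
}


def concept_counts(tokens):
    """Count each matched concept and propagate the count to all its hypernyms."""
    counts = Counter()
    for token in tokens:
        concept = WORD_TO_CONCEPT.get(token.lower())
        while concept is not None:
            counts[concept] += 1
            concept = HYPERNYM.get(concept)
    return counts


def depth(concept):
    """Number of hypernym steps from the concept up to the root."""
    steps = 0
    while HYPERNYM.get(concept) is not None:
        concept = HYPERNYM[concept]
        steps += 1
    return steps


def index_document(tokens, min_count=2):
    """Select concepts covering at least min_count tokens (assumed heuristic).

    A concept is dropped when one of its direct hyponyms covers exactly the
    same number of tokens, since the more specific concept is then preferred.
    """
    counts = concept_counts(tokens)
    candidates = [c for c, n in counts.items() if n >= min_count]
    kept = [
        c for c in candidates
        if not any(HYPERNYM.get(d) == c and counts[d] == counts[c]
                   for d in candidates)
    ]
    # Most frequent first; ties broken by specificity, then by name.
    return sorted(kept, key=lambda c: (-counts[c], -depth(c), c))


tokens = "The puppy chased the kitten and another dog".split()
print(index_document(tokens))
```

In this sketch the document is indexed with "mammal" and "dog" even though the word "mammal" never occurs in the text, which is the basic difference between associating concepts with a document and extracting keywords from it.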

Similar resources

Document Indexing With a Concept Hierarchy

We discuss the task of selecting the concepts that describe the contents of a given document. We propose to use a large hierarchical concept dictionary (thesaurus) for this task. A statistical method of document indexing driven by such a dictionary is proposed. The problem of handling non-terminal nodes in the hierarchy is discussed. Common-sense-compliant methods of automatically assigning ...

Thematic Annotation: extracting concepts out of documents

Semantic document annotation may be useful for many tasks. In particular, in the framework of the MDM project, topical annotation – i.e. the annotation of document segments with tags identifying the topics discussed in the segments – is used to enhance the retrieval of multimodal meeting records. Indeed, with such an annotation, meeting retrieval can integrate topics in the search criteria offe...

Document Indexing with a Concept Hierarchy (Índice de Documentos con una Jerarquía de Conceptos)

Given a large hierarchical concept dictionary (thesaurus, or ontology), the task of selecting the concepts that describe the contents of a given document is considered. A statistical method of document indexing driven by such a dictionary is proposed. The method is insensitive to inaccuracies in the dictionary, which allows for semi-automatic translation of the hierarchy into different languag...

Indexing with a Concept Hierarchy

Given a large hierarchical concept dictionary (thesaurus, or ontology), the task of selecting the concepts that describe the contents of a given document is considered. A statistical method of document indexing driven by such a dictionary is proposed. The method is insensitive to inaccuracies in the dictionary, which allows for semiautomatic translation of the hierarchy into different language...

Reflections on image indexing: a picture is worth a thousand words

Purpose: This paper presents various image indexing techniques and discusses their advantages and limitations. Methodology: Through a review of the literature, it identifies three main image indexing techniques, namely concept-based image indexing, content-based image indexing, and folksonomy. It then describes each technique. Findings: Concept-based image indexing is te...

Publication date: 2005